[feat] Enable MLA chunked prefill and KV cache reuse on SM121 by CodersAcademy006 · Pull Request #15347 · NVIDIA/TensorRT-LLM

CodersAcademy006 · 2026-06-14T04:20:21Z

I updated the SM version check logic in py_executor_creator.py to allow MLA chunked prefill and KV cache block reuse on SM121 (Blackwell) architectures. Specifically, I added SM121 (121) to the validation lists for both enable_block_reuse and enable_chunked_context checks, preventing the executor from automatically disabling these features on Blackwell GPUs.

This resolves issue #15344, where these optimization features were being disabled on SM121 devices because the Python-side validator was missing SM121 in its allowlist. The underlying C++ kernels already support SM121, so this change enables full compatibility for MLA optimizations on this hardware.
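For readers who have not seen the guard, the following is a minimal sketch of the kind of allowlist check described above. It is not the code from `py_executor_creator.py`: the names `MLA_SUPPORTED_SM`, `FeatureFlags`, and `apply_mla_guards` are hypothetical, and apart from `121` the allowlist entries are placeholders chosen for illustration only.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger("mla_guard_sketch")

# Hypothetical allowlist. Only 121 is taken from this PR; the other
# entries are illustrative placeholders, not the real list.
MLA_SUPPORTED_SM = (90, 100, 121)


@dataclass
class FeatureFlags:
    """Hypothetical stand-in for the two options the validator may disable."""
    enable_block_reuse: bool = True
    enable_chunked_context: bool = True


def apply_mla_guards(flags: FeatureFlags, sm_version: int,
                     supported=MLA_SUPPORTED_SM) -> FeatureFlags:
    """Disable MLA block reuse and chunked prefill on unlisted SM versions."""
    if sm_version in supported:
        return flags  # listed architecture: leave both features as requested

    names = "/".join(f"SM{v}" for v in supported)
    if flags.enable_block_reuse:
        logger.warning("MLA KV cache reuse is only enabled on %s; "
                       "disabling enable_block_reuse for SM%d", names, sm_version)
        flags.enable_block_reuse = False
    if flags.enable_chunked_context:
        logger.warning("MLA chunked prefill is only enabled on %s; "
                       "disabling enable_chunked_context for SM%d", names, sm_version)
        flags.enable_chunked_context = False
    return flags
```

With `121` in the allowlist, `apply_mla_guards(FeatureFlags(), 121)` leaves both flags set. Without it, the same call would silently turn both features off, which is the behavior reported in #15344.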

coderabbitai · 2026-06-14T04:21:58Z

📝 Walkthrough

In py_executor_creator.py, SM121 is added to the GPU SM version allowlists for two MLA feature guards inside create_py_executor: MLA KV cache block reuse and MLA chunked prefill. Both the conditional checks and their associated warning messages are updated to reflect SM121 as a supported architecture.
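As background on the numbers in these allowlists, an SM version is conventionally the CUDA compute capability written without the dot, so compute capability 12.1 corresponds to SM121. The helper below is a hypothetical sketch of that mapping, not the function the executor uses. With PyTorch, the `(major, minor)` pair can be obtained from `torch.cuda.get_device_capability()`.

```python
def sm_version_from_capability(major: int, minor: int) -> int:
    """Map a CUDA compute capability (major, minor) to an integer SM version.

    Hypothetical helper for illustration; assumes minor is a single digit.
    """
    if not 0 <= minor <= 9:
        raise ValueError(f"unexpected minor version: {minor}")
    return major * 10 + minor


# Compute capability 12.1 maps to SM121, the value added by this PR.
sm121 = sm_version_from_capability(12, 1)
```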

Changes

MLA SM121 allowlist expansion

| Layer / File(s) | Summary |
| --- | --- |
| SM121 added to MLA KV cache reuse and chunked prefill guards `tensorrt_llm/_torch/pyexecutor/py_executor_creator.py` | The SM version allowlists for the MLA KV cache block reuse guard and the MLA chunked prefill guard are each extended to include `121`, and both warning messages are updated to list SM121 as a supported SM version. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title '[feat] Enable MLA chunked prefill and KV cache reuse on SM121' accurately summarizes the main change: adding SM121 support to two MLA optimization features. |
| Linked Issues check | ✅ Passed | The changes fully satisfy issue `#15344` objectives: SM121 is added to both MLA KV block reuse and chunked prefill allowlists as required. |
| Out of Scope Changes check | ✅ Passed | All code changes in the PR are directly related to the linked issue `#15344`; the modifications only update the SM version allowlists as specified in the objectives. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Description check | ✅ Passed | The PR description provides clear context about the changes, references issue `#15344`, and explains the rationale for enabling SM121 support for MLA features. |


Signed-off-by: Srijan Upadhyay <srjnupadhyay@gmail.com>

karljang · 2026-06-22T17:22:26Z

@CodersAcademy006,
Thanks for putting this together.
However, we're going to land #15434 instead. Nothing wrong with the approach here; we just want one PR with tests rather than two near-identical ones. Appreciate you flagging the SM121 gap!
I'm closing this in favor of #15434.

CodersAcademy006 requested a review from a team as a code owner June 14, 2026 04:20

CodersAcademy006 requested a review from joyang-nv June 14, 2026 04:20

github-actions Bot assigned CodersAcademy006 Jun 14, 2026

feat: Enable SM121 support for MLA chunked prefill and KV block reuse

156e45b

Signed-off-by: Srijan Upadhyay <srjnupadhyay@gmail.com>

CodersAcademy006 force-pushed the feat/sm121-mla branch from 65577f3 to 156e45b Compare June 14, 2026 19:02

Merge branch 'main' into feat/sm121-mla

2c586ac

karljang closed this Jun 22, 2026

CodersAcademy006 wants to merge 2 commits into NVIDIA:main from CodersAcademy006:feat/sm121-mla
